How Good are Typological Distances for Determining Genealogical Relationships among Languages?

Authors

  • Taraka Rama
  • Prasanth Kolachina
Abstract

The recent availability of typological databases such as the World Atlas of Language Structures (WALS) has spurred investigations into their utility for language classification, the stability of typological features in genetic linguistics, and typological universals across the language families of the world. Existing work on building NLP resources such as parallel corpora and treebanks for under-resourced languages has a lot to gain from insights about inter-language relationships. Since Yarowsky et al. (2001), there have been a number of attempts to create resources for resource-poor languages by projecting information from resource-rich languages using comparable corpora. An important intuition in such work is that syntactic information can be transferred with higher accuracy between languages if they are similar. In this paper, we compare typological distances derived from fifteen vector similarity measures with family-internal classifications and with lexical divergence. These results are only a first step towards the use of the WALS database in the projection of NLP resources for typologically or genetically similar, yet resource-poor languages.
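The idea of a typological distance derived from a vector similarity measure can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not name its fifteen measures, so the two used here (normalized Hamming and Jaccard distance) are stand-in choices, and the language names and feature values are invented toy data rather than WALS entries.

```python
from itertools import combinations

# Hypothetical toy data: one categorical value per typological feature.
# None marks a feature with no recorded value for that language.
LANGS = {
    "lang_a": ["SOV", "postpositions", "suffixing", None],
    "lang_b": ["SOV", "postpositions", "suffixing", "no_articles"],
    "lang_c": ["SVO", "prepositions", "suffixing", "articles"],
}


def shared(u, v):
    """Return (index, value_u, value_v) for features attested in both languages."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(u, v))
            if a is not None and b is not None]


def hamming_distance(u, v):
    """Fraction of shared features on which the two languages disagree."""
    rows = shared(u, v)
    if not rows:
        raise ValueError("no features attested in both languages")
    return sum(a != b for _, a, b in rows) / len(rows)


def jaccard_distance(u, v):
    """1 - |intersection| / |union| over (feature index, value) pairs."""
    rows = shared(u, v)
    if not rows:
        raise ValueError("no features attested in both languages")
    set_u = {(i, a) for i, a, _ in rows}
    set_v = {(i, b) for i, _, b in rows}
    return 1.0 - len(set_u & set_v) / len(set_u | set_v)


def pairwise(langs, measure):
    """Distance for every unordered pair of languages under one measure."""
    return {(x, y): measure(langs[x], langs[y])
            for x, y in combinations(sorted(langs), 2)}


if __name__ == "__main__":
    for name, measure in [("hamming", hamming_distance),
                          ("jaccard", jaccard_distance)]:
        for pair, dist in pairwise(LANGS, measure).items():
            print(name, pair, round(dist, 3))
```

Restricting each comparison to features attested in both languages is one simple way to handle missing values in a sparse database; other treatments are possible and would change the distances. A matrix produced this way could then be set against a family-internal classification or a lexical distance, which is the kind of comparison the abstract describes.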

Similar resources

On the relation between structural diversity and geographical distance among languages: observations and computer simulations

Since the groundbreaking work of Nichols (1992) it has been clear that the use of typological databases for making inferences regarding linguistic prehistory could potentially have much to offer. The recent availability of larger typological databases such as Haspelmath et al. (2005) has brought the linguistics community closer to having a solid, empirical foundation for making actual claims ab...

Genealogical trees from genetic distances

Abstract In a population with haploid reproduction any individual has a single parent in the previous generation. If all genealogical distances among pairs of individuals (generations from the closest common ancestor) are known it is possible to exactly reconstruct their genealogical tree. Unfortunately, in most cases, genealogical distances are unknown and only genetic distances are available....

Discriminative Analysis of Linguistic Features for Typological Study

We address the task of automatically estimating the missing values of linguistic features by making use of the fact that some linguistic features in typological databases are informative to each other. The questions to address in this work are (i) how much predictive power do features have on the value of another feature? (ii) to what extent can we attribute this predictive power to genealogica...

Classifying Syntactic Regularities for Hundreds of Languages

This paper presents a comparison of classification methods for linguistic typology for the purpose of expanding an extensive, but sparse language resource: the World Atlas of Language Structures (WALS) (Dryer and Haspelmath, 2013). We experimented with a variety of regression and nearest-neighbor methods for use in classification over a set of 325 languages and six syntactic rules drawn from WA...

Contact-induced typological change

1. Introduction. It is easy to show that contact-induced change can have a profound effect on the typological profile of the receiving language. Probably the most obvious examples, and also the ones that are easiest to find, are changes in basic sentential word order. These are especially striking because it is word order features that have attracted the most attention in the typological liter...

Publication date: 2012